TGSum: Build Tweet Guided Multi-Document Summarization Dataset

Authors: Cao Ziqiang, Chen Chengyao, Li Sujian, Li Wenjie, Wei Furu, Zhou Ming
Publication date: 26/11/2015

The development of summarization research has been significantly hampered by the costly acquisition of reference summaries. This paper proposes an effective way to automatically collect news-related multi-document summaries at large scale, guided by social media reactions. We utilize two types of social labels in tweets, i.e., hashtags and hyperlinks. Hashtags are used to cluster documents into different topic sets. In addition, a tweet with a hyperlink often highlights certain key points of the corresponding document. We synthesize a linked document cluster to form a reference summary that covers most key points. To this end, we adopt the ROUGE metrics to measure the coverage ratio, and develop an Integer Linear Programming solution to discover the sentence set reaching the upper bound of ROUGE. Since we allow summary sentences to be selected from both documents and high-quality tweets, the generated reference summaries could be abstractive. Both the informativeness and the readability of the collected summaries are verified by manual judgment. In addition, we train a Support Vector Regression summarizer on DUC generic multi-document summarization benchmarks. With the collected data as an extra training resource, the performance of the summarizer improves on all the test sets. We release this dataset for further research. Comment: 7 pages, 1 figure in AAAI 201

arXiv.org e-Print Archive

CiteSeerX

The Hong Kong Polytechnic University Pao Yue-kong Library

Association for the Advancement of Artificial Intelligence: AAAI Publications

Inhomogeneous states with checkerboard order in the t-J Model

Authors: Chunhua Li, I. Affleck, M. Vershinin, Sen Zhou, Ziqiang Wang
Publication venue: American Physical Society (APS)
Publication date: 27/02/2006

We study inhomogeneous states in the t-J model using an unrestricted Gutzwiller approximation. We find that $pa\times pa$ checkerboard order, where $p$ is a doping dependent number, emerges from Fermi surface instabilities of both the staggered flux phase and the Fermi liquid state with realistic band parameters. In both cases, the checkerboard order develops at wave vectors $(\pm 2\pi/pa,0)$, $(0,\pm 2\pi/pa)$ that are tied to the peaks of the wave-vector dependent susceptibility, and is of the Lomer-Rice-Scott type. The properties of such periodic, inhomogeneous states are discussed in connection to the checkerboard patterns observed by STM in underdoped cuprates. Comment: Published Versio

arXiv.org e-Print Archive

Crossref

Multi-Document Summarization via Discriminative Summary Reranking

Authors: Cao Ziqiang, Li Sujian, Wan Xiaojun, Wei Furu, Zhou Ming
Publication date: 08/07/2015

Existing multi-document summarization systems usually rely on a specific summarization model (i.e., a summarization method with a specific parameter setting) to extract summaries for different document sets with different topics. However, according to our quantitative analysis, none of the existing summarization models can always produce high-quality summaries for different document sets, and even a summarization model with good overall performance may produce low-quality summaries for some document sets. Conversely, a baseline summarization model may produce high-quality summaries for some document sets. Based on the above observations, we treat the summaries produced by different summarization models as candidate summaries, and then explore discriminative reranking techniques to identify high-quality summaries from the candidates for different document sets. We propose to extract a set of candidate summaries for each document set based on an ILP framework, and then leverage Ranking SVM for summary reranking. Various useful features have been developed for the reranking process, including word-level features, sentence-level features and summary-level features. Evaluation results on the benchmark DUC datasets validate the efficacy and robustness of our proposed approach.

arXiv.org e-Print Archive

CiteSeerX